SGI Developer Toolbox 6.1

Text File | 1996-11-11 | 9KB | 221 lines

Great tool for a quick examination of a program.
The only tool available if you do not have access to the source.
Useful for a sanity check of the application:
is the programmer telling the truth about their code?

o GLdebug can be used both to debug and to tune:
    - tells you what graphics calls are being issued
    - look for lots of mode changes, or unnecessary mode settings
    - verify
        subpixel(1);
        glcompat(GLC_OLDPOLYGON, FALSE);
    - check on:
        shademodel(FLAT/GOURAUD)
        infinite vs. local lights, or LOCALVIEWER set in the Light Model
        two sided lighting
        mode changes: frequent calls to shademodel, zbuffer,
            blendfunction
        use of lmdef instead of lmcolor
    - check for duplicate data (often seen in normals and colors with
      flat-shading)
    - or unnecessary vertex bindings, such as unneeded per-vertex
      colors or normals for a flat shaded object.

    * must be single process to run gldebug
    * using ignore files will simplify gldebug output

o What warning messages are printed?

Explanation of Options:
-----------------------

-h            no history output.
              [run with this option when you only want to use the
              Stateviewer]
-w            no warning output.
-e            no error output.
-f            no fatal error output.
-c            do not run Controller.
-s            do not run Stateviewer.
              [run with these options when you only want to see the
              output history, i.e. when you are looking for known bad
              habits which degrade performance. It may be useful to
              generate a history file in one pass, then run the
              Stateviewer while examining the output.]
-C            generate C code in history file.
              [this is not very useful, as the code never looks like
              the application. It may let one reconstruct a bug without
              copying an unmanageably large amount of application code.
              It is also useful for producing a benchmark of the code
              in the application. It does not produce code which will
              compile.]
-F            flush output buffer to history file after each GL call.
-p wait       profile (output the number of times each GL function is
              called). wait is the number of GL calls to wait between
              each profile write to file.
              Profile output goes to GLdebug.count.
-i filename   ignore the GL functions listed in filename when writing
              output. filename should contain GL function names listed
              one per line.
              [very useful for suppressing commands which carry lots of
              data like texdef, defpattern, v3f, etc.]
-o filename   send history trace output to filename. Default is
              GLdebug.history.
-O            send history trace output to stdout. This overrides
              -o filename.
              [not very useful, history files are always big]

o Useful alias:
    gldebug -i ~/gldignore -sF

gldignore:
-----------
qread
getmatrix
defpattern
texdef

Taking a GLdebug Trace:
-----------------------

gldebug session to grab one frame:
    - start up gldebug
    - turn off output and breakpoints
    - set breakpoint at swapbuffers
    - go to interesting frame
    - turn on breakpoints
        > will stop at swapbuffers
    - turn on output
    - continue
        > will stop at swapbuffers, outputting one full scene to
          GLdebug.history
    - quit and look at output

    * note: grabbing one frame will show the state set during that
      frame but will not reflect modes that were set previously.
      therefore, it is best to have a program that can come up in the
      desired location and with the desired modes, and then grab the
      first 2 frames: one for initialization, one for continued
      drawing.

EXAMPLE:
--------

> Vince,
>
> Here is a chunk of the output.  Note that a number of different
> techniques are used for drawing the models within the scene.  So this
> is only representative of a subset of the drawing (e.g. I don't even
> know if any of the models in this section have textures turned on).
>

While i was working with Frank, he thought that their code was finely
tuned for VGX. He said something about a team of programmers working on
the code for 10 man years. At first i had little confidence that we
could improve his code, but i think we have found room for improvement.
First, as you both know, if you move an app from the VGX to RE and see
no improvement, it probably means that you have a CPU bottleneck or
that something really stupid is being done in the graphics code.
Unfortunately, we do not know what all of those "stupid" things are on
RE yet.

Also, in the demo that he is running there was no texture mapping. i am
not surprised to learn that there was little improvement for
non-texture mapped primitives. For standard phong lighted, Z-buffered,
non-textured primitives the performance is about the same. The
flat-shaded tmesh performance is exactly the same. The Gouraud shaded
tmesh performance is about 10% higher on a RE. The biggest improvement
comes with independent Gouraud shaded quads, about 33%. Turn on
texturing and you get a big win.

i helped Frank generate one frame of gldebug output and asked him to
send me the file. A quick glance at the data reveals 3 sets of
superfluous calls to the GL, only 2 of which could impact performance.
The improvements made here should result in improved performance on
both VGX and RE, since they will reduce the CPU bottleneck.

    1) n3f
    2) lmbind
    3) misc.

1) The biggest problem is with duplicated normals. One trick to
   remember is that the hardware caches the normal and provides a copy
   with any subsequent vertexes that are sent without normals while
   lighting is enabled. If you look at the tmeshes you will notice
   that 50 - 90+ % of the normal data is duplicated. Note that i
   suppressed the gldebug output of v3f commands. If you look at the
   first FLAT shaded tmesh, which has 12 vertexes, you will see that 12
   identical normals are being copied. Only one is needed, so 11 of the
   24 vectors sent for that mesh (12 vertexes plus 12 normals), nearly
   half of the data, are redundant. Since lighting was enabled, the
   same normal was also transformed for each copy. Furthermore, if
   multiple objects share the same normal it need only be sent once.
   This change may require rebuilding the database.

2) It appears that every new lmbind call is preceded by a call to
   lmbind(MATERIAL, 0), which disables lighting.
   This is only necessary if they wish to draw an unlighted object.
   This is inefficient toggling of modes. The RealityEngine is very
   sensitive to mode changes. Remove that lmbind.

3) There appear to be calls to things that are never used, e.g.
   getmatrix() and getpattern(). The query calls can be expensive
   because they copy data back to the host or often have to go into
   feedback mode.

Finally, it would be helpful to see some prof/pixie output from their
program to verify this. If we are truly experiencing a CPU bottleneck
then you should see gl_i_v3f and gl_i_n3f listed at the very top of the
pixie readings.

good luck and i hope this helps,

vince

> getpattern();
> getmatrix(OUT);
> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 5);
> shademodel(GOURAUD);
> bgntmesh();
> n3f({1.000000, 0.000000, 0.000000});
> n3f({1.000000, 0.000000, 0.000000});
> n3f({0.500000, 0.797443, -0.337763});
> n3f({0.500000, 0.797443, -0.337763});
> n3f({-0.500000, 0.797443, -0.337763});
> n3f({-0.500000, 0.797443, -0.337763});
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-0.500000, -0.797443, 0.337763});
> n3f({-0.500000, -0.797443, 0.337763});
> n3f({0.500000, -0.797443, 0.337763});
> n3f({0.500000, -0.797443, 0.337763});
> n3f({1.000000, 0.000000, 0.000000});
> n3f({1.000000, 0.000000, 0.000000});
> endtmesh();
> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 6);
> shademodel(FLAT);
> bgntmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> endtmesh();
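
Scanning a trace for these habits:
----------------------------------

The three checks above (duplicated normals, lmbind(MATERIAL, 0)
toggles, and query calls) are easy to automate once a history file
exists. What follows is only a rough sketch in Python, not part of
gldebug. It assumes the history file holds one GL call per line in the
form shown in the excerpt above, optionally prefixed with a "> " quote
marker; real GLdebug.history files may differ. scan_trace, CALL_RE,
QUERY_CALLS and SAMPLE are hypothetical names made up for this
illustration. It also assumes, following the note about the cached
normal, that the current normal persists across primitives.

```python
import re

# Assumed trace format (taken from the excerpt, not from gldebug docs):
# one call per line, e.g. "n3f({1.000000, 0.000000, 0.000000});",
# optionally prefixed with an email quote marker "> ".
CALL_RE = re.compile(r"^\s*(?:>\s*)?(\w+)\((.*)\);\s*$")

# Query calls named in the notes; extend as needed.
QUERY_CALLS = {"getmatrix", "getpattern"}


def scan_trace(lines):
    """Count repeated normals, lighting toggles, and query calls."""
    stats = {"n3f": 0, "dup_n3f": 0, "lmbind_toggles": 0, "queries": 0}
    last_normal = None  # argument text of the most recent n3f call
    prev = None         # (name, args) of the previous call seen
    for line in lines:
        m = CALL_RE.match(line)
        if not m:
            continue  # skip anything that does not look like a call
        name, args = m.group(1), m.group(2).strip()
        if name == "n3f":
            stats["n3f"] += 1
            # A normal identical to the last one sent is redundant,
            # because the cached normal would have been reused anyway.
            if args == last_normal:
                stats["dup_n3f"] += 1
            last_normal = args
        elif name in QUERY_CALLS:
            stats["queries"] += 1
        elif name == "lmbind":
            # lmbind(MATERIAL, 0) immediately followed by a bind of a
            # real material: the unbind was a wasted mode change.
            if (prev == ("lmbind", "MATERIAL, 0")
                    and args.startswith("MATERIAL,")
                    and args != "MATERIAL, 0"):
                stats["lmbind_toggles"] += 1
        prev = (name, args)
    return stats


SAMPLE = """\
> getpattern();
> getmatrix(OUT);
> lmbind(MATERIAL, 0);
> lmbind(MATERIAL, 6);
> shademodel(FLAT);
> bgntmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> n3f({-1.000000, 0.000000, 0.000000});
> swaptmesh();
> n3f({-1.000000, 0.000000, 0.000000});
> endtmesh();
"""

if __name__ == "__main__":
    print(scan_trace(SAMPLE.splitlines()))
    # {'n3f': 3, 'dup_n3f': 2, 'lmbind_toggles': 1, 'queries': 2}
```

To run it on a real trace, pass the open history file instead of
SAMPLE, e.g. scan_trace(open("GLdebug.history")). Normals are compared
as text, so two normals that differ only in formatting would not be
counted as duplicates.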